Skip to content

direct: store serialized_dashboard/serialized_space in state as content hashes#5609

Draft
shreyas-goenka wants to merge 10 commits into
databricks:mainfrom
shreyas-goenka:shreyas-goenka/dashboards-sha-state
Draft

direct: store serialized_dashboard/serialized_space in state as content hashes#5609
shreyas-goenka wants to merge 10 commits into
databricks:mainfrom
shreyas-goenka:shreyas-goenka/dashboards-sha-state

Conversation

@shreyas-goenka

Copy link
Copy Markdown
Contributor

Changes

The direct deploy engine persists the full planned config per resource in resources.json. For dashboards and genie spaces that includes the inlined serialized_dashboard / serialized_space contents, which routinely run from hundreds of KB to several MB (and roughly double once JSON-escaped into state). That content is never read back from state:

  • drift is detected via etag (both fields are ignore_remote_changes, etag_based), so the remote serialized value is never meaningfully compared;
  • a deploy always sends the contents to the API from the plan's new_state, never from saved state;
  • nothing resolves references out of these fields.

So the only thing the saved value is used for is a local "did the content change since last deploy?" equality check — which a hash serves exactly.

This PR adds an optional CompactState(state *T) (*T, error) adapter method (same idiom as RemapState / OverrideChangeDesc) that replaces such equality-only fields with a sha256:<hex> placeholder. The framework applies it both before persisting state and to every value entering the diff (saved state, the local config copy, and the remapped remote), so stored and compared values share one canonical form: unchanged content still yields an equal-hash skip, changed content yields a different hash, exactly as before.

dashboards.serialized_dashboard and genie_spaces.serialized_space implement it. The plan's new_state (sent to the API on apply) and the raw top-level remote_state snapshot keep their full content.

Compatibility

No state version bump. Legacy full-content state is hashed on read for the comparison and rewritten compactly on the next save (lazy migration). An older CLI reading new state sees a hash, plans one redundant update, and rewrites full content — safe. Bumping the version would instead hard-fail older CLIs, a worse failure mode for mixed-version CI/teams.

User-visible effect

bundle plan reports these fields as sha256:... in the changes section rather than embedding the (potentially multi-MB) serialized blob. resources.json shrinks correspondingly, as does the per-deploy state upload.

Field selection

A field qualifies only if it is ignore_remote_changes, is never read back from state, and is large enough to matter. Surveying all direct-engine resource types, only these two fields qualify today; the mechanism is declarative-by-method so a future file-inlined blob can opt in by implementing CompactState. A unit test guards the core invariant (the field must be ignore_remote_changes).

Tests

  • Unit: hashStateValue (determinism, idempotence, nil/empty), CompactState for both resources, and the ignore_remote_changes invariant guards.
  • Acceptance: regenerated affected direct-engine plan/state outputs (dashboards simple/detect-change/unpublish-out-of-band, genie inline, bind, migrate). The genie_spaces/version_migration script previously parsed the schema version out of the plan's serialized content; it now asserts local_remote_differ + the etag_based skip, which is the behavior it was really demonstrating.

This pull request and its description were written by Isaac.

…nt hashes

The direct deploy engine persists the full planned config per resource in
resources.json. For dashboards and genie spaces, that includes the inlined
serialized_dashboard / serialized_space contents, which routinely run into the
hundreds of KB to several MB. These fields are never read back from state: drift
is detected via etag (they are ignore_remote_changes), and a deploy always sends
the contents from the plan's new_state, never from saved state.

Add an optional CompactState adapter method that replaces such equality-only
fields with a "sha256:<hex>" content hash. The framework applies it both before
persisting state and to every value entering the diff, so stored and compared
values share one form and unchanged content still produces an equal-hash skip.
The plan's new_state (sent to the API) and the raw remote_state snapshot keep
full content.

No state version bump: legacy full-content state is hashed on read for the
comparison and rewritten compactly on the next save.

Co-authored-by: Isaac
…together

Extend the genie_spaces/serialized_space test so a single deploy asserts both
sides of the SHA-only-state contract on the same state:
  - the create requests carry the full serialized_space (existing capture, plus
    an explicit "no sha256 in the request" guard), and
  - the persisted state stores only the sha256 content hash, never the content.

Co-authored-by: Isaac
The two error paths added for state compaction in Bind (adapter lookup and
CompactState) returned without removing the temp state file, unlike every other
error path in the function. Add os.Remove(tmpStatePath) to match.

Co-authored-by: Isaac
…s one token

The generic test replacements split the persisted sha256 digest into pieces:
[0-9a-f]{32} turned it into "[DASHBOARD_ID][DASHBOARD_ID]" in the bind test, and
[0-9]{3,} fragmented it as "sha256:...[NUMID]..." in several plan outputs. Add a
single low-Order replacement in acceptance/bundle/test.toml that collapses the
64-char digest to "sha256:[ALPHANUMID]" before those rules run, so every state and
plan output renders the hash consistently. Also add the missing newline after the
state-assertion title in the genie serialized_space test.

Co-authored-by: Isaac
serialized_space gets the same treatment in a follow-up; keeping this PR to
dashboards only.

Co-authored-by: Isaac
…aration

The per-resource CompactState(state any) (any, error) adapter method (plus its
IResource entry and calladapt plumbing) is removed. Instead, fields persisted as
a content hash are declared under hashed_in_state in resources.yml, and a generic
dresources.CompactState(cfg, state) hashes them via structaccess.

Behaviour is unchanged: new_state still carries the full content (so deploy and
deploy --plan send the real bytes), and only resources.json + the diff use the
hash. dashboards.serialized_dashboard is the sole declared field; acceptance
outputs are identical.

Co-authored-by: Isaac
Spell out that hashing the saved state on read is what lets state written before
this change (or before a field was added to hashed_in_state) still diff correctly
against the now-hashed local config, and that it is read-only normalization (no
state_version bump). Link the PR.

Co-authored-by: Isaac
…nvariant test

- Drop the CompactState call on the remapped remote state in CalculatePlan. Hashing
  only ever helps the local diff (saved vs config); a hashed_in_state field is always
  ignore_remote_changes, so its remote diff is discarded regardless. The plan now shows
  the real remote value instead of a meaningless remote hash.
- Mask the state hash in acceptance output as "sha256:[HASH]" instead of the generic
  "[ALPHANUMID]", so it clearly reads as a content hash.
- Add TestHashedInStateImpliesIgnoreRemoteChanges enforcing, for every resource, that
  any hashed_in_state field is also ignore_remote_changes (a hashed field can only
  detect local changes; its remote diff must be ignored or it would drift forever).

Co-authored-by: Isaac
# ignore_remote_changes (etag_based) and re-sent from config on every deploy, so it
# is never read back from state; persist only its content hash to keep state small.
- field: serialized_dashboard
reason: large_content

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

omit reason here. It does not show up in plan.

Hash the remapped remote on hashed_in_state fields too, so a hashed field is a hash
on all three sides of the diff (saved, config, remote). Once the saved value is a
hash, every comparison must be hash-vs-hash — including remote drift (Remote vs New)
— or it is hash-vs-content nonsense. With the remote hashed, remote drift is detected
as hash != hash, so a field can be hashed_in_state without being ignore_remote_changes.

The two are now declared independently. serialized_dashboard needs both, but for
unrelated reasons: hashed because the blob is large, ignore_remote_changes because the
server normalizes it (so its remote hash never matches the config hash). Dropped the
test that wrongly required hashed_in_state to imply ignore_remote_changes.

Co-authored-by: Isaac
…ndtrip)

Add dashboard-state-sha: a direct-only test (terraform does not hash state) that runs
both in-memory (READPLAN="") and saved-plan (READPLAN=1, deploy --plan) deploys and
asserts:
  - the create request carries the FULL serialized_dashboard (out.create.serialized.json,
    identical across READPLAN, so the saved plan applies the real content, not the hash);
  - the persisted state holds only the content hash (out.state.direct.txt);
  - the plan keeps full content in new_state but reports the diff as a hash;
  - a re-plan with no local change is a no-op (remote normalization ignored, etag_based).

Local diff detection (edit -> update) is covered by change-serialized-dashboard.

Co-authored-by: Isaac
@github-actions

Copy link
Copy Markdown
Contributor

An authorized user can trigger integration tests manually by following the instructions below:

Trigger:
go/deco-tests-run/cli

Inputs:

  • PR number: 5609
  • Commit SHA: 85af285ff8ab3a32c11ba1dc630f8b83099cdb11

Checks will be approved automatically on success.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant